Search for: All records

Creators/Authors contains: "LaPierre, Nathan"


  1. Mendelian Randomization (MR) has emerged as a powerful approach to leverage genetic instruments to infer causality between pairs of traits in observational studies. However, the results of such studies are susceptible to biases due to weak instruments as well as the confounding effects of population stratification and horizontal pleiotropy. Here, we show that family data can be leveraged to design MR tests that are provably robust to confounding from population stratification, assortative mating, and dynastic effects. We demonstrate in simulations that our approach, MR-Twin, is robust to confounding from population stratification and is not affected by weak instrument bias, while standard MR methods yield inflated false positive rates. We then conducted an exploratory analysis of MR-Twin and other MR methods applied to 121 trait pairs in the UK Biobank dataset. Our results suggest that confounding from population stratification can lead to false positives for existing MR methods, while MR-Twin is immune to this type of confounding, and that MR-Twin can help assess whether traditional approaches may be inflated due to confounding from population stratification. 
    Free, publicly-accessible full text available May 17, 2024
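    (MR-Twin's family-based test is defined in the paper itself; as background, the sketch below shows the standard single-instrument Wald ratio estimator that population stratification and weak instruments can bias. All numbers are hypothetical summary statistics, not values from the study.)

    ```python
    def wald_ratio(beta_exposure, beta_outcome, se_outcome):
        """Single-instrument Wald ratio: the causal effect of exposure on
        outcome, estimated as the SNP-outcome effect divided by the
        SNP-exposure effect. The standard error uses a first-order delta
        method approximation that ignores uncertainty in beta_exposure,
        which is exactly where weak-instrument bias enters."""
        estimate = beta_outcome / beta_exposure
        se = se_outcome / abs(beta_exposure)
        return estimate, se

    # Hypothetical GWAS summary statistics for one genetic instrument
    est, se = wald_ratio(beta_exposure=0.12, beta_outcome=0.03, se_outcome=0.006)
    ```

    A small SNP-exposure effect (denominator) inflates both the estimate and its noise, which is why weak instruments are flagged as a validity threat above.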
  2. Abstract Mendelian Randomization (MR) studies are threatened by population stratification, batch effects, and horizontal pleiotropy. Although a variety of methods have been proposed to mitigate those problems, residual biases may still remain, leading to highly statistically significant false positives in large databases. Here we describe a suite of sensitivity analysis tools that enables investigators to quantify the robustness of their findings against such validity threats. Specifically, we propose the routine reporting of sensitivity statistics that reveal the minimal strength of violations necessary to explain away the MR results. We further provide intuitive displays of the robustness of the MR estimate to any degree of violation, and formal bounds on the worst-case bias caused by violations multiple times stronger than observed variables. We demonstrate how these tools can aid researchers in distinguishing robust from fragile findings by examining the effect of body mass index on diastolic blood pressure and Townsend deprivation index. 
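    (The paper's sensitivity statistics are defined for the full MR setting; the toy function below only illustrates the underlying "explain away" idea with hypothetical numbers: the smallest shift in the SNP-outcome effect, e.g. from an unmodeled pleiotropic path, that would erase statistical significance.)

    ```python
    def min_bias_to_explain_away(beta_outcome, se_outcome, z_crit=1.96):
        """Smallest absolute shift in the SNP-outcome effect estimate that
        moves it inside the +/- z_crit * se confidence band, i.e. the
        minimal violation strength needed to explain away significance."""
        return max(0.0, abs(beta_outcome) - z_crit * se_outcome)

    # Hypothetical effect with z = 6: a sizeable bias is needed to nullify it
    bias_needed = min_bias_to_explain_away(beta_outcome=0.03, se_outcome=0.005)
    ```

    A finding is "robust" in this sense when the bias required to overturn it is implausibly large relative to the effects of observed covariates.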
  3. Zeggini, Eleftheria (Ed.)
    Increasingly large Genome-Wide Association Studies (GWAS) have yielded numerous variants associated with many complex traits, motivating the development of “fine mapping” methods to identify which of the associated variants are causal. Additionally, GWAS of the same trait for different populations are increasingly available, raising the possibility of refining fine mapping results further by leveraging different linkage disequilibrium (LD) structures across studies. Here, we introduce multiple study causal variants identification in associated regions (MsCAVIAR), a method that extends the popular CAVIAR fine mapping framework to a multiple study setting using a random effects model. MsCAVIAR only requires summary statistics and LD as input, accounts for uncertainty in association statistics using a multivariate normal model, allows for multiple causal variants at a locus, and explicitly models the possibility of different SNP effect sizes in different populations. We demonstrate the efficacy of MsCAVIAR in both a simulation study and a trans-ethnic, trans-biobank fine mapping analysis of High Density Lipoprotein (HDL). 
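    (MsCAVIAR's random-effects, multi-study model is specified in the paper; the sketch below shows only the simpler single-study, CAVIAR-style multivariate normal likelihood it generalizes. The non-centrality parameter `ncp` is a hypothetical placeholder.)

    ```python
    import numpy as np

    def zscore_loglik(z, ld, causal_idx, ncp=5.0):
        """Log-likelihood of observed z-scores under z ~ N(LD @ mu, LD),
        where mu is zero except for a non-centrality `ncp` at the SNPs in
        causal_idx. Configurations of causal SNPs with higher likelihood
        are better candidates."""
        m = len(z)
        mu = np.zeros(m)
        mu[list(causal_idx)] = ncp
        resid = np.asarray(z, dtype=float) - ld @ mu
        _, logdet = np.linalg.slogdet(ld)
        return -0.5 * (m * np.log(2 * np.pi) + logdet
                       + resid @ np.linalg.solve(ld, resid))
    ```

    Scoring every candidate causal configuration this way (weighted by a prior on the number of causal SNPs) yields posterior probabilities per configuration, from which a credible set of variants can be built.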
    Abstract Metagenomic profiling, predicting the presence and relative abundances of microbes in a sample, is a critical first step in microbiome analysis. Alignment-based approaches are often considered accurate yet computationally infeasible. Here, we present a novel method, Metalign, that performs efficient and accurate alignment-based metagenomic profiling. We use a novel containment min hash approach to pre-filter the reference database prior to alignment and then process both uniquely aligned and multi-aligned reads to produce accurate abundance estimates. In performance evaluations on both real and simulated datasets, Metalign is the only method evaluated that maintained high performance and competitive running time across all datasets. 
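    (Metalign's actual pre-filter is more involved; this is a minimal, hypothetical sketch of the containment idea it names: MinHash-sketch the query's k-mers, then estimate what fraction of them each reference genome contains, so that only references above a containment threshold proceed to the expensive alignment step.)

    ```python
    import hashlib

    def kmer_hashes(seq, k=21):
        """Hash every k-mer of a sequence to an integer."""
        return {int(hashlib.sha1(seq[i:i + k].encode()).hexdigest(), 16)
                for i in range(len(seq) - k + 1)}

    def minhash_sketch(seq, k=21, sketch_size=8):
        """Bottom-k MinHash sketch: the sketch_size smallest k-mer hashes."""
        return set(sorted(kmer_hashes(seq, k))[:sketch_size])

    def containment(sketch, reference_hashes):
        """Estimated containment index: the fraction of the query's sketch
        that appears in the reference's full k-mer hash set."""
        return len(sketch & reference_hashes) / len(sketch)
    ```

    Because the sketch is small and fixed-size while only the reference keeps its full hash set, containment can be estimated accurately even when the query is a tiny fraction of the reference, which is what makes it a cheap pre-filter.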
  5. Abstract Evaluating metagenomic software is key to optimizing metagenome interpretation and is the focus of the Initiative for the Critical Assessment of Metagenome Interpretation (CAMI). The CAMI II challenge engaged the community to assess methods on realistic and complex datasets with long- and short-read sequences, created computationally from around 1,700 new and known genomes, as well as 600 new plasmids and viruses. Here we analyze 5,002 results from 76 program versions. Substantial improvements were seen in assembly, some due to long-read data. Related strains were still challenging for assembly and for genome recovery through binning, as was assembly quality for the latter. Profilers matured markedly, with taxon profilers and binners excelling at higher bacterial ranks but underperforming for viruses and Archaea. Clinical pathogen detection results revealed a need to improve reproducibility. Runtime and memory usage analyses identified efficient programs, including some that were also top performers on other metrics. The results identify challenges and guide researchers in selecting methods for their analyses.